36 


Comments on Cyber Security in Industrial 
Control Systems and Automation 


JACOB BRODSKY AND JOSEPH WEISS 


OVERVIEW 

Industrial control systems (ICS) are systems of systems. An 
ICS facilitates a physical outcome unlike an office envi- 
ronment, where the information technology (IT) is the end 
product. An ICS relies upon customized, embedded, single-purpose
computers. These computers typically use older, less complex
operating systems and have significantly longer lifetimes.

The contrast between Office IT and ICS matters because most
of the expertise in cyber security is Office IT oriented.
Although the techniques of these experts
are very much in demand for ICS applications, their priorities 
and their performance assumptions are often in direct con- 
flict with the needs for ICS security and performance. This 
discussion is meant not just for engineers but also for those 
with a background in Office IT. The concepts of ICS cyber 
security are a synthesis and intersection of these two differ- 
ent and technically deep fields. 

A significant difference between ICS and IT is that IT 
focuses on malicious cyber attacks almost to the exclusion of 
unintentional incidents while an ICS needs to address both. 
The U.S. National Institute of Standards and Technology 
(NIST) defines a cyber incident as 

An occurrence that actually or potentially jeopardizes the 
confidentiality, integrity, or availability (CIA) of an infor- 
mation system or the information the system processes, 
stores, or transmits or that constitutes a violation or 
imminent threat of violation of security policies, security 
procedures, or acceptable use policies. Incidents may be 
intentional or unintentional. (FIPS PUB 200, Minimum
Security Requirements for Federal Information and
Information Systems, March 2006.)

Another difference between these two types of sys- 
tems relates to the age of the embedded systems. Due to 
the expense of designing, implementing, and validating an 
ICS, it is often quite mature when first installed, and then is 
expected to last for as much as 15-20 years of service. 

An ICS is often complex and may contain security flaws 
as part of the system design. Because these computers are
usually embedded inside subsystems of large complex facilities,
end users are frequently unaware of their existence, let
alone the actual function they perform.

The existence of systems that the user is unaware of is 
one of many social issues that are important to the overall 
security posture, but outside the scope of this discussion. 
An engineer needs to understand the design, operation, and 
maintenance issues to properly secure an ICS. 

First, the ICS must be resilient. In other words, if parts of 
it fail, the remaining pieces should continue running in some 
capacity. Outright failure or denial of service in response to 
traffic saturation or unexpected inputs is usually unacceptable.
This concept is referred to as graceful degradation. It
is not just good practice for resiliency's sake, but also good
security practice, because it ensures that the effects of a
compromised system do not spread to other areas of an ICS or
impact safety.

Second, the secured ICS should have additional tools and 
software to detect and remediate worms, viruses, and Trojan 
software. This will be elaborated upon later. 

Third, a secure ICS should contain facilities to assure the 
authenticity or provenance of the data or software. 

These concepts are not that divergent from most IT 
applications. The differences, however, are the threats, the 
priorities, the implementation, and the responses to an unin- 
tentional cyber incident or a malicious cyber attack. 

While ICS attacks are less frequent than Office IT
attacks, they are more likely to stem from the malice of a
nation-state or from espionage by a large multinational
company than from random hackers. In other words, this
is a defense against lower-frequency, higher-impact attacks.
The Stuxnet worm is an example of a nation-state type of
attack.

When an Office IT system is under attack, because it 
is central to the operation, every activity comes to a halt. 
Restoring from backups is a matter of losing only a relatively 
small amount of work or time. However, with an ICS, it is 
unusual for the process to stop just because the ICS is no 
longer functional. Inertia, pressure, electrical energy, and 
product reactions do not stop simply because the controllers 
are disabled. As a cyber incident can cause physical damage, 




© 2012 by Bela Liptak 



Networks, Security, and Protection


recovery efforts from an ICS failure may entail significant 
physical injuries, environmental damage, even explosive 
destruction — never mind the monetary cost. 

Classic IT security has a tendency to centralize all com- 
puting systems to a few defensible components. An ICS, 
while it does have centralized control components, should 
not become overly reliant upon centrally managed resources. 

A secure ICS should have methods for detecting an
attack. Office IT uses anti-virus software (a software
blacklisting method) and Intrusion Prevention Systems (which
may use stateful firewalls instead of ordinary port-based
firewalls). A significant advantage of an ICS is that it can
rely upon known applications and IP addresses to whitelist
(allow only certain applications to run), because few
applications are running or expected to run, and all of them
are known. These whitelisting systems need to be configured
to alert operators that someone or something is attempting to
run unauthorized software.
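
The whitelisting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the application names, the addresses, and the LaunchRequest and check_launch names are hypothetical, and a real product would hook into the operating system rather than be called by hand.

```python
from dataclasses import dataclass

# Hypothetical whitelist for one HMI workstation; every name and
# address below is illustrative and not taken from a real system.
ALLOWED_APPLICATIONS = {"hmi_runtime.exe", "historian_client.exe"}
ALLOWED_PEERS = {"10.0.10.5", "10.0.10.6"}


@dataclass(frozen=True)
class LaunchRequest:
    application: str  # executable that is about to run
    source_ip: str    # address that requested the launch


def check_launch(request: LaunchRequest) -> list[str]:
    """Return operator alerts; an empty list means the request is allowed."""
    alerts = []
    if request.application not in ALLOWED_APPLICATIONS:
        alerts.append(f"ALERT: unauthorized application {request.application!r}")
    if request.source_ip not in ALLOWED_PEERS:
        alerts.append(f"ALERT: unknown address {request.source_ip}")
    return alerts


if __name__ == "__main__":
    for alert in check_launch(LaunchRequest("nc.exe", "192.168.1.77")):
        print(alert)
```

The point of the sketch is that anything not explicitly listed produces an alert for the operator, rather than being silently allowed or silently dropped.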

In general, IT designs and policies tend to rely on con- 
fidentiality, integrity, and availability, in that order. An ICS 
tends to reverse these priorities: availability, integrity, and 
then confidentiality. In an ICS, the top priorities are
better described as safety, availability, and integrity. These
priorities also determine the technologies to be used. Because
IT has confidentiality as its highest priority, encryption is a
priority. However, this is usually not the case for an ICS.

Partial Practical Examples 

Most IT networks and systems tend to be very dynamic. New
software, new protocols, and the addition and removal of
network nodes are constant. In contrast, an ICS tends
to be static. The disappearance of a node or the appearance 
of another is actually a significant event to be noted in an 
alarm log. Because of this, it is still practical and reason- 
able for an ICS to use static IP addresses for Programmable 
Logic Controller (PLC), Human-Machine Interface (HMI), 
and Historian components. 
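
Because the node list is static, checking it is simple. The following Python sketch compares the addresses seen on a segment with a documented inventory; the addresses, the EXPECTED_NODES table, and the inventory_alarms name are assumptions made for illustration.

```python
# Hypothetical static inventory for one ICS segment: address -> role.
EXPECTED_NODES = {
    "10.0.10.11": "PLC",
    "10.0.10.21": "HMI",
    "10.0.10.31": "Historian",
}


def inventory_alarms(observed_addresses):
    """Return alarm-log entries for nodes that disappeared or appeared."""
    observed = set(observed_addresses)
    expected = set(EXPECTED_NODES)
    alarms = []
    for address in sorted(expected - observed):
        alarms.append(f"MISSING {EXPECTED_NODES[address]} at {address}")
    for address in sorted(observed - expected):
        alarms.append(f"UNEXPECTED node at {address}")
    return alarms


if __name__ == "__main__":
    seen = ["10.0.10.11", "10.0.10.21", "10.0.10.99"]
    for entry in inventory_alarms(seen):
        print(entry)
```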

Similarly, with an ICS, the latency associated with the
use of Domain Name Servers can often make using one onerous.
The few milliseconds a lookup takes are not usually noticed
in an office environment, but in an ICS they can be a
significant problem.

The problems most Office IT systems have with keeping
hundreds of hosts files (static lists of IP addresses and their
domain names kept on each individual computer) updated
are not usually significant in an ICS, because an ICS does not
change nearly as much as an office does. Meanwhile, if an
ICS were configured to use the Domain Name System (DNS),
the lack of a working DNS server could bring network
functionality down, even though everything else may be
available. It is usually recommended that critical applications
using TCP/IP in an ICS avoid dependence on DNS and use
hard-coded IP addresses or hosts files instead.
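
As a minimal sketch of this recommendation, the Python below resolves names from a static table in hosts-file format and never falls back to DNS. The table contents and the parse_hosts and resolve names are hypothetical.

```python
# Hypothetical static table in hosts-file format (address, then names).
HOSTS_TEXT = """
# ICS segment A
10.0.10.11  plc-01
10.0.10.21  hmi-01 operator-console
10.0.10.31  historian-01
"""


def parse_hosts(text):
    """Build a name -> address table, ignoring comments and blank lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


def resolve(name, table):
    """Resolve from the static table only; there is no DNS fallback."""
    try:
        return table[name.lower()]
    except KeyError:
        raise LookupError(f"{name} is not in the static host table") from None


if __name__ == "__main__":
    print(resolve("operator-console", parse_hosts(HOSTS_TEXT)))
```

A lookup that fails here fails immediately and locally, which is easier to diagnose than a timeout waiting for a name server that is no longer reachable.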

In IT, if something takes an extra 500 ms to complete,
it is not usually significant. In an ICS network, that same
500 ms delay could amount to a denial of service and send
the process into some form of emergency safety mode.

An ICS can deal with this by profiling traffic, connections,
and counters. This is the baseline data that should be
carefully documented when a project goes into service, and
again with every significant ICS update or change. While most
IT staff do not bother to profile networks that closely, one can
profile an ICS very tightly and set alarms when changes are detected.
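
One way to picture such a profile is as a table of expected connections and their typical packet counts. The sketch below is an assumption-laden illustration: the connection tuples, the packets-per-minute figures, the 25% tolerance, and the profile_deviations name are all hypothetical.

```python
def profile_deviations(baseline, observed, tolerance=0.25):
    """Compare observed traffic with the documented baseline.

    Both arguments map a (source, destination, port) tuple to packets
    per minute. A connection is reported when it is new, when it is
    missing, or when its rate differs from the baseline by more than
    the given fraction.
    """
    findings = []
    for connection in sorted(set(baseline) | set(observed)):
        expected = baseline.get(connection)
        actual = observed.get(connection)
        if expected is None:
            findings.append(("new connection", connection))
        elif actual is None:
            findings.append(("connection missing", connection))
        elif abs(actual - expected) > tolerance * expected:
            findings.append(("traffic changed", connection))
    return findings


if __name__ == "__main__":
    baseline = {
        ("10.0.10.21", "10.0.10.11", 502): 600,
        ("10.0.10.31", "10.0.10.11", 502): 60,
    }
    observed = {
        ("10.0.10.21", "10.0.10.11", 502): 610,
        ("10.0.10.31", "10.0.10.11", 502): 200,
        ("10.0.10.99", "10.0.10.11", 80): 5,
    }
    for finding in profile_deviations(baseline, observed):
        print(finding)
```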

In addition to this, each component of an ICS needs to be 
documented and tracked with version control software. This 
software should include authentication of the components so 
that unauthorized components can be detected. 
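
A simple form of component authentication is to compare each firmware or software image against a digest recorded when the version was released. The Python sketch below uses SHA-256 for this; the file name, the manifest layout, and the verify_component name are assumptions. Note that a digest only shows that an image is unchanged; the manifest itself would still need to be protected, for example by signing it.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an image as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_component(name, image, manifest):
    """Return True only if the image matches the digest recorded for it."""
    expected = manifest.get(name)
    if expected is None:
        return False  # a component that was never registered is unauthorized
    return hmac.compare_digest(sha256_hex(image), expected)


if __name__ == "__main__":
    # Hypothetical manifest recorded under version control at release time.
    manifest = {"plc-01.bin": sha256_hex(b"release 1.4 image")}
    print(verify_component("plc-01.bin", b"release 1.4 image", manifest))
    print(verify_component("plc-01.bin", b"tampered image", manifest))
```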

A secure ICS also needs tests of what happens should a
component be compromised. The key
to robust, secure design is to avoid unnecessary dependence
on other computer resources. Key servers, in particular, may
be an extremely important component of an ICS, and thus it
is equally critical that they maintain many mirrored
repositories across the ICS.

Furthermore, one should not overlook the use of hard- 
wired, non-cyber resources. It may not be practical in every 
situation, but one should never become too reliant upon a net- 
worked ICS, especially for features such as safety. 

The issue of patching has more in common with Office 
IT than we might want to admit. The problem is twofold: 
There is a significant risk associated with leaving unpatched 
embedded systems in service. Yet, the cost of patching is 
not trivial either. This was exactly the complaint that many 
Office IT managers made over a decade ago. 

The difference is that many offices have grown used to 
the idea of spraying patches everywhere and hoping for the 
best. This notion does not work in an ICS environment and 
the authors make a very strong recommendation to test all 
patches before deployment — even if the ICS vendor says it 
will probably work. 

For example, a PLC rated as a Safety Instrumented
Function (SIF) component might be vulnerable to a classic
LAND attack. If this vulnerability is not addressed, the PLC
could be compromised by a single malicious packet that
causes the embedded processor to go into an infinite loop.
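
In a LAND attack, the malicious packet is a TCP connection request (SYN) whose source address and port are forged to equal its destination address and port, so a vulnerable stack ends up answering itself. A filter in front of the device can recognize this pattern. The Python sketch below shows only the test itself; the TcpSegment type and the looks_like_land name are hypothetical, and the fields are assumed to have been parsed already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpSegment:
    """Already-parsed header fields of one TCP segment (illustrative)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    syn: bool


def looks_like_land(segment: TcpSegment) -> bool:
    """True for a SYN whose source equals its destination."""
    return (
        segment.syn
        and segment.src_ip == segment.dst_ip
        and segment.src_port == segment.dst_port
    )


if __name__ == "__main__":
    forged = TcpSegment("10.0.10.11", 502, "10.0.10.11", 502, syn=True)
    print(looks_like_land(forged))
```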

While it is clearly unacceptable to have a cyber-vulnerable 
PLC in a safety system, one cannot simply patch it blindly. 
The entire safety system needs to be validated, and the new
software and firmware must be demonstrated to be safe and
to no longer contain this vulnerability.

The latter demonstration is where the ICS user community
can take a page from the Office IT experience. While
many people decry websites such as Metasploit.com for
publicly hosting scripts that demonstrate particular
vulnerabilities, such databases do have a use. They can be
used to prove that whatever fix the ICS vendor came up with
actually does address the problem. The scripting language
and the format of the report may be useful for a semi-private
database, where users and ICS vendors can gather and share
information on how to build better products.




Naturally, patching is more difficult in an ICS than in
Office IT applications. However, the lower frequency of
attacks and the structure of the networks may enable users
to buy time to fix the problem.

To help with this, ICS networks should be organized into
various zones, with stateful firewalls between them acting as
conduits. The stateful treatment of packets by the firewall is
important.

Conventional firewalls address port numbers alone. For
example, a Modbus/TCP packet is expected to appear on
port 502. If one wanted to block Modbus/TCP packets
with a conventional firewall, the firewall would simply look
for port 502 traffic and block it.

However, if one wanted to send that Modbus/TCP traf- 
fic to the outside through a conventional firewall, it would 
not be difficult to put the traffic on, say, port 80 that might 
be open for web page traffic. A conventional firewall would 
not “care” that the Modbus traffic is on port 80. It only pays 
attention to the port number and assumes that it can only be 
what it claims to be. 

Thus, one could export data and control opportunities
using non-standard port numbers that might be open through
the firewall. However, with a stateful firewall, the firewall
actually looks at the traffic (a capability often called deep
packet inspection) and tries to see if it is behaving the way
it is expected to. Modbus/TCP traffic does not look like
Hypertext Transfer Protocol (HTTP). A stateful firewall would
notice this and prevent the traffic from going through.
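
The difference between checking a port number and checking the traffic itself can be illustrated with a short Python sketch. It relies on two facts about the protocols: an HTTP request begins with a method name such as GET, and a Modbus/TCP frame begins with a seven-byte header whose protocol identifier is zero and whose length field counts the bytes that follow it. The function names are hypothetical, and a real firewall inspects far more than this.

```python
import struct

# A few common HTTP request methods; the list is deliberately not exhaustive.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ", b"OPTIONS ")


def looks_like_http_request(payload: bytes) -> bool:
    """True if the payload starts like an HTTP request line."""
    return payload.startswith(HTTP_METHODS)


def looks_like_modbus_tcp(payload: bytes) -> bool:
    """True if the payload starts with a plausible Modbus/TCP header.

    Header layout: transaction id (2 bytes), protocol id (2 bytes, always
    0), length (2 bytes, counting the unit id and everything after it),
    unit id (1 byte). At least one function-code byte must follow.
    """
    if len(payload) < 8:
        return False
    _transaction, protocol, length, _unit = struct.unpack(">HHHB", payload[:7])
    return protocol == 0 and length == len(payload) - 6


def allow_on_port_80(payload: bytes) -> bool:
    """Pass only traffic that actually looks like HTTP on the web port."""
    return looks_like_http_request(payload)


if __name__ == "__main__":
    # Read Holding Registers request: 10 registers starting at address 0.
    modbus = bytes.fromhex("00010000000601030000000a")
    print(allow_on_port_80(b"GET / HTTP/1.1\r\n\r\n"), allow_on_port_80(modbus))
```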

As noted in ISA-S99, zones and conduits are the core
concepts in securing an ICS network. Zones should be as
small as practical. Conduits are firewalled communications 
ports between zones. They should be designed to pass only 
the traffic that must pass and they should also be designed to 
monitor traffic volumes on both sides of the firewall. 

One of the key self-integrity monitoring systems should
be the traffic volume counters. People often make the mistake
of inadvertently cross-connecting an office
segment with an ICS segment. The sheer volume of
broadcast/multicast traffic on most office networks can saturate
many older network segments of an ICS. This is what could
have occurred at the Browns Ferry Nuclear Unit 3 power 
plant [1] when the operators lost control of the reactor water 
cooling pumps and were forced to SCRAM the reactor. 
Unfortunately, like most ICS cyber incidents, cyber forensic 
data were not available. 

Early warnings and diagnostic traffic data from conduit 
firewalls could be of significant use toward building better 
resilience. 

Bandwidth concerns are more commonplace than one
might think. All network media, be they serial buses, wireless
links, WAN connections, or fiber-optic cables, should have
both lower and upper limits for normal traffic profiles, and
there should be alarms when traffic falls outside those limits.

One approach often frowned upon by ICS engineering
staff is the network scanning often performed by Office
IT workers inexperienced with ICS. Some will use
utilities such as “nmap” to scan an ICS network segment.
Such scans are helpful in finding new or undocumented
network nodes or services. There is nothing inherently wrong
with scanning like this, except that the default scan rates for
this utility produce dangerously high traffic volumes for a
typical ICS. Consequently, these high traffic volumes have
often shut down legacy ICS components such as PLCs.

Though it is not appropriate behavior, many embedded
devices such as PLC Ethernet interfaces may spend too
much time attempting to process the flood of traffic, fail to
reset a watchdog timer within the prescribed number of
milliseconds, and then reset, causing still more packets to be
ignored.

IT staff need to be aware of these limitations and keep 
traffic levels within certain prescribed limits determined by 
offline experiments. 
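
One way to enforce such a limit is to make it part of how every scan is launched. The Python sketch below builds an “nmap” command line that uses the utility's --max-rate option to cap the number of packets sent per second. The build_scan_command name, the target address, and the rate of 10 packets per second are assumptions for illustration; the safe rate must come from the offline experiments on the actual equipment.

```python
def build_scan_command(target, ports, max_packets_per_second):
    """Return an nmap argument list with an explicit packet-rate cap.

    -p selects the ports to scan and --max-rate limits how many packets
    nmap sends per second. The command is only built here, not executed.
    """
    if max_packets_per_second <= 0:
        raise ValueError("the rate limit must be a positive number")
    return [
        "nmap",
        "-p", ports,
        "--max-rate", str(max_packets_per_second),
        target,
    ]


if __name__ == "__main__":
    # Hypothetical PLC address and a deliberately low rate.
    print(" ".join(build_scan_command("10.0.10.11", "1-1024", 10)))
```

Wrapping the command this way means nobody can start a scan of the ICS segment without stating a rate limit.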

Access control to various network entities is usually done 
by passwords in most offices. In an ICS, however, password 
policies are often toxic. People tend to forget passwords
under stress; there are too many to remember, and writing
them down defeats the purpose of having them.

IT Security will point out that access control can rely on any
of three things: something you know, something you are,
or something you have. Passwords represent the first of these.
One could easily use access cards (something you have), or 
biometrics such as fingerprint or retina scanners (something 
you are). Modern operating systems are getting better at 
using biometrics and access cards. 

When designing networks, the engineer must take note of
recovery times when a link fails. For example, the Spanning
Tree Protocol, used between switches to maintain a loop-free
forwarding topology, needs a certain period of time to determine
that a link is dead and then to restore connections.
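
A rough worked example shows why this matters. With the classic IEEE 802.1D default timers (maximum age 20 s, forward delay 15 s), a switch may wait for the maximum age to expire and then spend two forward-delay periods in the listening and learning states, so recovery from an indirect link failure can take about 50 s. The Python sketch below does this arithmetic; the function names and the process tolerances are hypothetical, and networks using Rapid Spanning Tree or tuned timers will recover much faster.

```python
def stp_worst_case_recovery_s(max_age_s=20.0, forward_delay_s=15.0):
    """Approximate worst-case recovery time of classic spanning tree.

    The stale information must age out (max age), and the port then
    passes through listening and learning (two forward delays).
    """
    return max_age_s + 2 * forward_delay_s


def link_recovery_acceptable(tolerance_s, **timers):
    """True if the process can ride through a spanning tree recovery."""
    return stp_worst_case_recovery_s(**timers) <= tolerance_s


if __name__ == "__main__":
    print(stp_worst_case_recovery_s())       # default timers
    print(link_recovery_acceptable(5.0))     # process tolerates a 5 s outage
```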

The IT staff needs to be aware of the requirements for
the protocols in an ICS; otherwise they will be tempted to
use common Office IT-oriented default settings. Overlooking
these parameters often leads to needless friction between
the two communities.

Once a suitable network has been designed, however, it 
should be scanned regularly in search of unauthorized ser- 
vices and/or addresses (at the lower traffic rates suitable to 
the system). 

Before deployment on a network, a new embedded device
should be fully characterized, not just for compatibility, but
also for vulnerabilities. Scanning it with “nmap” on a test
bench LAN, isolated from everything else, to determine what
ports are open and whether this matches what the documentation
says is simple and inexpensive. It is not uncommon for
vendors to leave diagnostic back doors in the field equipment
that may make the equipment vulnerable to attack. If fuzzing
software and equipment are available, such tests can reveal the
robustness of the code in the device. Your IT security experts
can assist with explanations of the implications of these tests.
These tests are especially meaningful if one is testing an
SIF-rated instrument or PLC with firmware or networking
of any sort.




Once the vulnerabilities are known (and there will be
issues with almost all devices), one can deploy the devices behind
firewalls and intrusion-detection software. Knowing what
the vulnerabilities are, one can have some idea of what attack
traffic might look like. Alarms should report attempted
attacks.

The latter is especially important because many incidents 
happen with little or no documentation or even alarms when 
things are awry. Odd things start to happen with the physi- 
cal process and people often shrug, and go on about their 
business. 

In general, this leads to the lack of forensics data in ICSs. 
Few are making much headway here. However, some guide- 
lines are appropriate. First, the data often exists. The problem 
is that it may not be available in a form that the operators or 
even the engineers can read. Many network equipment man- 
ufacturers include SCADA-like features that track the behav- 
ior of a switch, router, or even a PC using Simple Network 
Management Protocol (SNMP). 

Some SCADA systems are starting to collect this SNMP
traffic; however, few users seem to understand the
significance of what one can see with SNMP. For example, if
someone connects to a switch, an SNMP trap (a spontaneous
report) can be sent to a designated computer running a
trap receiver process. An unexpected jack-in from a laptop
nobody recognizes is a significant security event. It should be
logged and alarmed for the operators so they know where to
look for the perpetrator.


Conversely, a report that an embedded device has suddenly
stopped giving a link indication is helpful for diagnostics.
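
The two cases can be sketched in Python as a small classifier for link traps that a trap receiver has already decoded. The object identifiers are the standard ones for the linkDown and linkUp notifications; the switch names, the EXPECTED_PORTS table, and the classify_trap name are hypothetical, and the port number here stands in for the interface index that a real trap carries.

```python
# Standard notification identifiers for link state changes.
LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"
LINK_UP = "1.3.6.1.6.3.1.1.5.4"

# Hypothetical record of which device is wired to which switch port.
EXPECTED_PORTS = {
    ("switch-01", 3): "plc-01",
    ("switch-01", 4): "hmi-01",
}


def classify_trap(switch, port, trap_oid):
    """Turn a decoded link trap into an operator message, or None."""
    device = EXPECTED_PORTS.get((switch, port))
    if trap_oid == LINK_UP and device is None:
        return f"SECURITY: unexpected connection on {switch} port {port}"
    if trap_oid == LINK_DOWN and device is not None:
        return f"DIAGNOSTIC: {device} lost link on {switch} port {port}"
    return None


if __name__ == "__main__":
    print(classify_trap("switch-01", 7, LINK_UP))
    print(classify_trap("switch-01", 3, LINK_DOWN))
```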

Other miscellaneous issues: Safety systems should be 
segmented from the rest of the ICS to the extent possible. 
For example, if someone hits an E-Stop in the middle of a 
process, all the upstream systems will need to be shut down 
safely to prevent making the problem bigger and potentially 
more dangerous. The preferable method is to use instrumen- 
tation to detect the E-Stop independently of the safety sys- 
tem and then to use that instrumentation to signal for a wider 
area shut down. It is imperative that, whatever systems are
used, even a denial of service attack against the safety system
should not inhibit proper, reliable operation.


CONCLUSIONS 

ICS are functionally, operationally, and technically different
from IT systems. Cyber security should be addressed
accordingly.


Reference 

1. U.S. Nuclear Regulatory Commission Information Notice: 
2007-15: Effects of Ethernet-based, Non-safety Related 
Controls on the Safe and Continued Operation of Nuclear 
Power Stations, April 17, 2007. 





